Chinese Sketch Engine and the Extraction of Grammatical Collocations
Authors
Abstract
This paper introduces a new technology for collocation extraction in Chinese. Sketch Engine (Kilgarriff et al., 2004) has proven to be a very effective tool for the automatic description of lexical information, including collocation extraction, based on large-scale corpora. The original Sketch Engine work was based on the BNC. We extend Sketch Engine to Chinese using the Gigaword corpus from the LDC. We discuss the functions available in the prototype Chinese Sketch Engine (CSE) as well as the robustness of the language-independent adaptation of Sketch Engine. We conclude by discussing how Chinese-specific linguistic information can be incorporated to improve the ...
Similar resources
The Construction of a Chinese Collocational Knowledge Resource and Its Application for Second Language Acquisition
The appropriate use of collocations is a challenge for second language acquisition. However, high-quality and easily accessible Chinese collocation resources are available to neither teachers nor students. This paper presents the design and construction of a large-scale resource of Chinese collocational knowledge, and a web-based application (OCCA, Online Chinese Collocation Assistant) which ...
Extracting Academic Subjects Semantic Relations Using Collocations
The paper presents an approach to analyzing the semantic content of academic subjects and their internal relations using statistical techniques for collocation extraction from a large electronic corpus of educational texts. It offers a survey and analysis of some related corpus-based approaches to extracting conceptual relations for educational purposes and presents a technique for semantic search of c...
Computing Thresholds of Linguistic Saliency
We propose and test several computational methods to automatically determine possible saliency cut-off points in Sketch Engine (Kilgarriff and Tugwell, 2001). Sketch Engine currently displays collocations in descending order of importance, as well as by grammatical relation. However, Sketch Engine does not provide suggestions for a cut-off point such that any items above this cut-off point ma...
Augmented Comparative Corpora and Monitoring Corpus in Chinese: LIVAC and Sketch Search Engine Compared
The increasing availability of numerous corpora has significantly contributed to the understanding of words in terms of their underlying semantic structures and lexical networks (e.g. COBUILD, WordNet etc.). Through data mining and information retrieval, research in this area has vastly expanded our appreciation that what constitutes lexical knowledge goes beyond synonymy, hyponymy, metonymy, m...
Lexical and Grammatical Collocations in Writing Production of EFL Learners
Lewis (1993) recognized the significance of word combinations, including collocations, by presenting the lexical approach. Because of the crucial role of collocation in vocabulary acquisition, this research set out to evaluate the rate of collocations in Iranian EFL learners' writing production across L1 and L2. In addition, L1 interference with L2 collocational use in the learners' writing samples was st...
Journal title:
Volume, issue
Pages -
Publication date: 2005